Abstract. Significant efforts have gone into the development of statistical
models for analyzing data in the form of networks, such as social networks.
Most existing work has focused on modeling static networks, which represent
either a single time snapshot or an aggregate view over time. There has been
recent interest in statistical modeling of dynamic networks, which are
observed at multiple points in time and offer a richer representation of many
complex phenomena. In this paper, we propose a state-space model for dynamic
networks that extends the well-known stochastic blockmodel for static networks
to the dynamic setting. We then propose a procedure to fit the model using a
modification of the extended Kalman filter augmented with a local search. We
apply the procedure to analyze a dynamic social network of email
communication.

1 Introduction

Many complex physical, biological, and social phenomena are naturally
represented by networks. Tremendous efforts have been dedicated to analyzing
network data, which has led to the development of many formal statistical
models for networks. Most research has focused on static networks, which
represent either a single time snapshot of the phenomenon being investigated
or an aggregate view over time. As such, statistical models for static
networks have a long history in statistics and sociology among other fields
[2]. However, most complex phenomena, including social behavior, are
time-varying, which has led researchers to consider dynamic, time-evolving
networks.

In this paper, we consider dynamic networks represented by a sequence of
snapshots of the network at discrete time steps. We characterize such networks
using a set of unobserved time-varying states from which the observed
snapshots are derived. We propose a state-space model for dynamic networks
that combines two types of statistical models: a static model for the
individual snapshots and a temporal model for the evolution of the states. The
network snapshots are modeled using the stochastic blockmodel [5], a simple
parametric model commonly used in the analysis of static social networks. The
state evolution is modeled by a stochastic dynamic system. Using a Central
Limit Theorem approximation, we develop a near-optimal procedure for fitting
the proposed model in the on-line setting where only past and present network
snapshots are available. The inference procedure involves a modification of
the extended Kalman filter, which is used for state tracking in many
applications [3], augmented with a local search strategy. We apply the
proposed procedure to analyze a dynamic social network of email communication
and predict future email activity.

2 Related work 

Several statistical models for dynamic networks have previously been proposed
by extending a static model to the dynamic setting in a similar fashion to our
proposed model [2]. Two such models include temporal extensions of the
exponential random graph model [1] and latent space model [13]. More closely
related to the state-space model we propose are several temporal extensions of
stochastic blockmodels (SBMs). SBMs divide nodes in the network into multiple
classes and generate edges independently with probabilities $\theta_{ab}$
dependent on the class memberships $a, b$ of the nodes [5]. Yang et al. [15]
propose a dynamic SBM involving a transition matrix that specifies the
probability that a node in class $i$ at time $t$ switches to class $j$ at time
$t+1$ for all $i, j, t$, and fit the model using Gibbs sampling and simulated
annealing. Ho et al. [4] propose a temporal extension of a mixed-membership
version of the SBM using linear state-space models for the class membership
vectors of node clusters. One major difference between [4, 15] and this paper
is that we treat the edge probabilities $\theta_{ab}$ as time-varying states,
while [4, 15] treat them as time-invariant parameters. In addition, our model
allows for a simpler inference procedure using a Central Limit Theorem
approximation. We demonstrate the importance of the time-varying states for
analysis of a dynamic social network in Section 5.

3 Static stochastic blockmodels 

We first introduce notation and summarize the static stochastic blockmodel
(SSBM), which we use as the static model for the individual network snapshots.
We represent a dynamic network by a time-indexed sequence of graphs, with
$W^t = [w^t_{ij}]$ denoting the adjacency matrix of the graph observed at time
step $t$. $w^t_{ij} = 1$ if there is an edge from node $i$ to node $j$ at time
$t$, and $w^t_{ij} = 0$ otherwise. We assume that the graphs are directed,
i.e. $w^t_{ij} \neq w^t_{ji}$ in general, and that there are no self-edges,
i.e. $w^t_{ii} = 0$. $W^{(s)}$ denotes the set of all snapshots up to time
$s$, $\{W^s, W^{s-1}, \ldots, W^1\}$. The notation $i \in a$ indicates that
node $i$ is a member of class $a$. $|a|$ denotes the number of nodes in class
$a$. The classes of all nodes at time $t$ are given by a vector $c^t$ with
$c^t_i = a$ if $i \in a$ at time $t$. We denote the submatrix of $W^t$
corresponding to the relations between nodes in class $a$ and class $b$ by
$W^t_{[a][b]}$. We denote the vectorized equivalent of a matrix $X$, i.e. the
vector obtained by simply stacking columns of $X$ on top of one another, by
$x$. Doubly-indexed subscripts such as $x_{ij}$ denote entries of matrix $X$,
while singly-indexed subscripts such as $x_i$ denote entries of the vectorized
equivalent $x$.

Consider a snapshot at an arbitrary time step $t$. An SSBM is parameterized
by a $k \times k$ matrix $\Theta^t = [\theta^t_{ab}]$, where $\theta^t_{ab}$
denotes the probability of forming an edge between a node in class $a$ and a
node in class $b$, and $k$ denotes the number of classes. The SSBM decomposes
the adjacency matrix into $k^2$ blocks, where each block is associated with
relations between nodes in two classes $a$ and $b$. Each block corresponds to
a submatrix $W^t_{[a][b]}$ of the adjacency matrix $W^t$. Thus, given the
class membership vector $c^t$, each entry of $W^t$ is an independent
realization of a Bernoulli random variable with a block-dependent parameter;
that is, $w^t_{ij} \sim \text{Bernoulli}(\theta^t_{c^t_i c^t_j})$.
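
As an illustration of the generative step just described, the following is a minimal sketch that draws one directed snapshot from an SSBM. The function name `sample_ssbm`, the zero-based class labels, and the example values of the edge-probability matrix are assumptions made for this sketch and do not come from the paper.

```python
import numpy as np

def sample_ssbm(theta, classes, rng):
    """Draw one directed adjacency matrix from a static SBM (illustrative).

    theta   : (k, k) array, theta[a, b] is the edge probability from class a to b
    classes : length-n integer array, classes[i] is the class of node i
    rng     : numpy random Generator
    """
    classes = np.asarray(classes)
    # Edge probability for every ordered pair (i, j) is theta[c_i, c_j].
    probs = theta[classes[:, None], classes[None, :]]
    # Independent Bernoulli draw for every ordered pair.
    w = (rng.random(probs.shape) < probs).astype(int)
    # The model assumes no self-edges.
    np.fill_diagonal(w, 0)
    return w

# Hypothetical example: 2 classes, 5 nodes.
rng = np.random.default_rng(0)
theta = np.array([[0.8, 0.1],
                  [0.2, 0.5]])
classes = np.array([0, 0, 0, 1, 1])
W = sample_ssbm(theta, classes, rng)
```

Because the graph is directed, `W` is generally not symmetric, and `theta` need not be symmetric either.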

SBMs are used in two settings: 

1. The a priori blockmodeling setting, where class memberships are known or
assumed, and the objective is to estimate the matrix of edge probabilities
$\Theta^t$.

2. The a posteriori blockmodeling setting, where the objective is to
simultaneously estimate $\Theta^t$ and the class membership vector $c^t$.

Since each entry of $W^t$ is independent, the likelihood for the SBM is given by

\[
f(W^t; \Phi^t) = \prod_{a=1}^{k} \prod_{b=1}^{k}
  (\theta^t_{ab})^{m^t_{ab}} (1 - \theta^t_{ab})^{n^t_{ab} - m^t_{ab}}
= \exp\left\{ \sum_{a=1}^{k} \sum_{b=1}^{k}
  \left[ m^t_{ab} \log(\theta^t_{ab})
  + (n^t_{ab} - m^t_{ab}) \log(1 - \theta^t_{ab}) \right] \right\},
\tag{1}
\]

where $m^t_{ab} = \sum_{i \in a} \sum_{j \in b} w^t_{ij}$ denotes the number
of observed edges in block $(a, b)$, and

\[
n^t_{ab} =
\begin{cases}
|a||b| & a \neq b \\
|a|(|a| - 1) & a = b
\end{cases}
\]

denotes the number of possible edges in block $(a, b)$ [6]. The parameters are
given by $\Phi^t = \Theta^t$ in the a priori setting, and
$\Phi^t = \{\Theta^t, c^t\}$ in the a posteriori setting. In the a priori
setting, a sufficient statistic for estimating $\Theta^t$ is the matrix $Y^t$
of block densities (ratio of observed edges to possible edges within a block)
with entries $y^t_{ab} = m^t_{ab} / n^t_{ab}$. $Y^t$ also happens to be the
maximum-likelihood estimate of $\Theta^t$, which can be shown [5] by setting
the derivative of the logarithm of (1) to 0.
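
The block counts, block densities, and the logarithm of (1) can be computed directly from a snapshot. The sketch below is illustrative only: the helper names `block_counts` and `log_likelihood`, the small clipping constant `eps`, and the toy adjacency matrix are assumptions, and every class is assumed to contain at least two nodes so that no block has zero possible edges.

```python
import numpy as np

def block_counts(w, classes, k):
    """Return (m, n): observed and possible edge counts for each block."""
    classes = np.asarray(classes)
    sizes = np.bincount(classes, minlength=k)
    m = np.zeros((k, k))
    n = np.zeros((k, k))
    for a in range(k):
        for b in range(k):
            # Submatrix of edges from nodes in class a to nodes in class b.
            block = w[np.ix_(classes == a, classes == b)]
            m[a, b] = block.sum()
            # Diagonal blocks exclude self-edges.
            n[a, b] = sizes[a] * (sizes[a] - 1) if a == b else sizes[a] * sizes[b]
    return m, n

def log_likelihood(theta, m, n, eps=1e-12):
    """Logarithm of the SBM likelihood (1), with theta clipped away from 0 and 1."""
    t = np.clip(theta, eps, 1 - eps)
    return float(np.sum(m * np.log(t) + (n - m) * np.log(1 - t)))

# Toy snapshot with 4 nodes and 2 classes (hypothetical data).
W = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
classes = np.array([0, 0, 1, 1])
m, n = block_counts(W, classes, k=2)
Y = m / n  # block densities, the a priori maximum-likelihood estimate
```

Evaluating `log_likelihood` at `Y` gives a value at least as large as at any other probability matrix, which is one way to sanity-check the maximum-likelihood claim numerically.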

Estimation in the a posteriori setting is more involved, and many methods
have been proposed, including Gibbs sampling [8], label-switching [6, 16], and
spectral clustering [12]. The label-switching methods use a heuristic for
solving the combinatorial optimization problem of maximizing the likelihood
(1) over the set of possible class memberships, which is too large for an
exhaustive search.

4 Dynamic stochastic blockmodels

We propose a state-space model for dynamic networks that consists of a
temporal extension of the static stochastic blockmodel. First we present the
model and inference procedure for a priori blockmodeling, and then we discuss
the additional steps necessary for a posteriori blockmodeling. The inference
procedure is on-line, i.e. the state estimate at time $t$ is formed using only
observations from time $t$ and earlier.

4.1 A priori blockmodels 

In the a priori SSBM setting, $Y^t$ is a sufficient statistic for estimating
$\Theta^t$ as discussed in Section 3. Thus in the a priori dynamic SBM
setting, we can equivalently treat $Y^t$ as the observation rather than $W^t$.
The entries of $W^t_{[a][b]}$ are independent and identically distributed
(iid) Bernoulli($\theta^t_{ab}$); thus by the Central Limit Theorem, the
sample mean $y^t_{ab}$ is approximately Gaussian with mean $\theta^t_{ab}$ and
variance $(\sigma^t_{ab})^2 = \theta^t_{ab}(1 - \theta^t_{ab}) / n^t_{ab}$,
where $n^t_{ab}$ was defined in Section 3. We assume that $y^t_{ab}$ is indeed
Gaussian for all $(a, b)$ and posit the linear observation model

\[ Y^t = \Theta^t + Z^t, \]

where $Z^t$ is a zero-mean iid Gaussian noise matrix with variance
$(\sigma^t_{ab})^2$ for the $(a, b)$th entry.

In the dynamic setting where past snapshots are available, the observations
would be given by the set $Y^{(t)}$. The set $\Theta^{(t)}$ can then be viewed
as states of a dynamic system that is generating the noisy observation
sequence. We complete the model by specifying a model for the state evolution
over time. Since $\theta^t_{ab}$ is a probability and must be bounded between
0 and 1, we instead work with the matrix $\Psi^t = [\psi^t_{ab}]$ where
$\psi^t_{ab} = \log(\theta^t_{ab}) - \log(1 - \theta^t_{ab})$, the logit of
$\theta^t_{ab}$. A simple model for the state evolution is the random walk

\[ \psi^t = \psi^{t-1} + v^t, \tag{2} \]

where $\psi^t$ is the vector representation of the matrix $\Psi^t$, and $v^t$
is a random vector of zero-mean Gaussian entries, commonly referred to as
process noise, with covariance matrix $\Gamma^t$. The entries of the process
noise vector are not necessarily independent or identically distributed
(unlike the entries of $Z^t$) to allow for states to evolve in a correlated
manner. The observation model can then be written in terms of $\psi^t$ as¹

\[ y^t = h(\psi^t) + z^t, \tag{3} \]

where the function $h : \mathbb{R}^{k^2} \to \mathbb{R}^{k^2}$ is defined by
$h_i(x) = 1 / (1 + e^{-x_i})$, i.e. the logistic function applied to each
entry of $x$. We denote the covariance matrix of $z^t$ by $\Sigma^t$, which is
a diagonal matrix² with entries given by $(\sigma^t_i)^2$. A graphical
representation of the proposed model for the dynamic network is shown in
Fig. 1.

¹ Note that we have converted the block densities $Y^t$ and observation noise
$Z^t$ to their respective vector representations $y^t$ and $z^t$.
² The indices $(a, b)$ for $(\sigma^t_{ab})^2$ are converted into a single
index $i$ corresponding to the vector representation $\sigma^t$.

Fig. 1. Graphical representation of the proposed model. The rectangular boxes
denote observed quantities, and the ovals denote unobserved quantities. The
logistic SBM refers to applying the logistic function to each entry of
$\psi^t$ to obtain $\Theta^t$, then generating $W^t$ using $\Theta^t$ and
$c^t$.
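
To make the state evolution concrete, the sketch below simulates a random walk on the logits and maps each state back to edge probabilities with the logistic function. It is a simplified illustration: the function name `simulate_states`, the starting probabilities, and the choice of independent process noise with a common standard deviation (a diagonal process noise covariance) are assumptions for this sketch, whereas the model itself allows correlated process noise.

```python
import math
import random

def logit(p):
    return math.log(p) - math.log(1.0 - p)

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_states(theta0, steps, noise_std, seed=0):
    """Random walk on the logits of a vectorized edge-probability matrix.

    theta0    : list of initial edge probabilities (vectorized matrix)
    steps     : number of time steps to simulate
    noise_std : standard deviation of the independent process noise (assumed)
    Returns a list of probability vectors, one per time step (including t = 0).
    """
    rng = random.Random(seed)
    psi = [logit(p) for p in theta0]
    path = [list(theta0)]
    for _ in range(steps):
        # State update: previous logit plus zero-mean Gaussian process noise.
        psi = [x + rng.gauss(0.0, noise_std) for x in psi]
        # The logistic map keeps every edge probability strictly inside (0, 1).
        path.append([logistic(x) for x in psi])
    return path

path = simulate_states([0.1, 0.5, 0.9], steps=50, noise_std=0.3)
```

Working on the logit scale is what lets an unconstrained Gaussian random walk drive quantities that must remain valid probabilities.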



To perform inference on this model, we assume the initial state is Gaussian
distributed, i.e. $\psi^0 \sim \mathcal{N}(\mu^0, \Gamma^0)$, and that
$\{\psi^0, v^1, \ldots, v^t, z^1, \ldots, z^t\}$ are mutually independent. If
(3) was linear in $\psi^t$, then the optimal estimate of $\psi^t$ in terms of
minimum mean-squared error would be given by the Kalman filter [3]. Due to the
non-linearity, we apply the extended Kalman filter (EKF), which linearizes the
dynamics about the predicted state and provides a near-optimal estimate of
$\psi^t$. The predicted state under the random walk model is simply
$\hat\psi^{t|t-1} = \hat\psi^{t-1|t-1}$ with covariance
$R^{t|t-1} = R^{t-1|t-1} + \Gamma^t$. Let $J^t$ denote the Jacobian of $h$
evaluated at the predicted state $\hat\psi^{t|t-1}$. The EKF update equations
are as follows [3]:

Near-optimal Kalman gain:
\[ K^t = R^{t|t-1} (J^t)^T \left[ J^t R^{t|t-1} (J^t)^T + \Sigma^t \right]^{-1} \]

Posterior state estimate:
\[ \hat\psi^{t|t} = \hat\psi^{t|t-1} + K^t \left[ y^t - h(\hat\psi^{t|t-1}) \right] \]

Posterior estimate covariance:
\[ R^{t|t} = (I - K^t J^t) R^{t|t-1} \]

The posterior state estimate $\hat\psi^{t|t}$ provides a near-optimal fit to
the model at time $t$ given the observed sequence $W^{(t)}$. How to choose the
hyperparameters $(\mu^0, \Gamma^0, \Sigma^t, \Gamma^t)$ in an optimal manner
is beyond the scope of this paper and is discussed in [3, chap. 5].
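
One predict-and-update cycle of the EKF described in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `ekf_step` and the numerical values are hypothetical, and the observation noise covariance is passed in as a known matrix. The Jacobian of the entrywise logistic function is diagonal with entries $h_i(1 - h_i)$.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def ekf_step(psi_prev, R_prev, y, Gamma, Sigma):
    """One EKF cycle for the random-walk state model with logistic observations.

    psi_prev, R_prev : posterior state estimate and covariance at time t-1
    y                : vectorized block densities observed at time t
    Gamma, Sigma     : process and observation noise covariances (assumed known)
    """
    # Prediction under the random walk: the mean is unchanged, covariance grows.
    psi_pred = psi_prev
    R_pred = R_prev + Gamma
    # Linearize the logistic observation function about the predicted state.
    h_pred = logistic(psi_pred)
    J = np.diag(h_pred * (1.0 - h_pred))
    # Kalman gain.
    S = J @ R_pred @ J.T + Sigma
    K = R_pred @ J.T @ np.linalg.inv(S)
    # Posterior state estimate and covariance.
    psi_post = psi_pred + K @ (y - h_pred)
    R_post = (np.eye(len(psi_pred)) - K @ J) @ R_pred
    return psi_post, R_post

# Hypothetical two-state example.
psi0, R0 = np.zeros(2), np.eye(2)
y1 = np.array([0.7, 0.3])
psi1, R1 = ekf_step(psi0, R0, y1, 0.01 * np.eye(2), 0.01 * np.eye(2))
```

In this example the first state moves above zero and the second below zero, pulling the implied edge probabilities toward the observed densities 0.7 and 0.3, while the posterior variances shrink relative to the predicted ones.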



4.2 A posteriori blockmodels 

In many applications, the class memberships $c^t$ are not known a priori and
must be estimated along with $\Psi^t$. This can be done using label-switching
methods [6, 16], but rather than maximizing the likelihood, we maximize the
posterior state density given the entire sequence of observations $W^{(t)}$ up
to time $t$ to account for the prior information. This is done by alternating
between label-switching and applying the EKF.
The posterior state density is given by

\[
f(\psi^t \mid W^{(t)}) \propto
f(W^t \mid \psi^t, W^{(t-1)}) \, f(\psi^t \mid W^{(t-1)}).
\tag{4}
\]

By the conditional independence of current and past observations given the
current state, $W^{(t-1)}$ drops out of the first term in (4). It can thus be
obtained simply by substituting $h(\psi^t)$ for $\Theta^t$ in (1). The second
term in (4) is equivalent to $f(\psi^t \mid y^{(t-1)})$ because the class
memberships at all previous time steps have already been estimated. By
applying the Kalman filter [3] to the linearized temporal model,
$f(\psi^t \mid y^{(t-1)}) \sim \mathcal{N}(\hat\psi^{t|t-1}, R^{t|t-1})$. Thus
the logarithm of the posterior density is given by

\[
\log f(\psi^t \mid W^{(t)}) = c
- \frac{1}{2} (\psi^t - \hat\psi^{t|t-1})^T (R^{t|t-1})^{-1}
  (\psi^t - \hat\psi^{t|t-1})
+ \sum_{a=1}^{k} \sum_{b=1}^{k} \left\{ m^t_{ab} \log\left[ h(\psi^t_{ab}) \right]
+ (n^t_{ab} - m^t_{ab}) \log\left[ 1 - h(\psi^t_{ab}) \right] \right\},
\tag{5}
\]

where $c$ is a constant term independent of $\psi^t$ that can be ignored.³

We use the log-posterior (5) as the objective function for label-switching. We
find that a simple local search (hill climbing) algorithm [11] initialized
using the estimated class memberships at the previous time step suffices,
because only a small fraction of nodes change classes between time steps in
most applications. At the initial time step, we employ the spectral clustering
algorithm of Sussman et al. [12] for the SSBM as the initialization.
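
The alternation between label-switching and the EKF can be sketched as a hill-climbing loop that scores each single-node move with the log-posterior (5). The code below is one possible reading of the procedure, not the authors' implementation. The function names, the plug-in observation variance computed from the predicted state, the rule that skips moves leaving a class with fewer than two nodes, and the toy network are all assumptions made for illustration.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def block_counts(w, classes, k):
    onehot = np.eye(k)[classes]                  # node-by-class indicators
    sizes = onehot.sum(axis=0)
    m = onehot.T @ w @ onehot                    # observed edges per block
    n = np.outer(sizes, sizes) - np.diag(sizes)  # possible edges, no self-edges
    return m, n

def ekf_update(psi_pred, R_pred, m, n):
    p = logistic(psi_pred)
    J = np.diag(p * (1.0 - p))
    nv = n.flatten(order="F")                    # column stacking
    y = m.flatten(order="F") / nv
    Sigma = np.diag(p * (1.0 - p) / nv)          # plug-in CLT variance (assumed)
    K = R_pred @ J.T @ np.linalg.inv(J @ R_pred @ J.T + Sigma)
    return psi_pred + K @ (y - p)

def log_posterior(psi, psi_pred, R_pred, m, n):
    """Equation (5) without the constant c."""
    d = psi - psi_pred
    prior = -0.5 * d @ np.linalg.solve(R_pred, d)
    p = logistic(psi)
    mv, nv = m.flatten(order="F"), n.flatten(order="F")
    return prior + np.sum(mv * np.log(p) + (nv - mv) * np.log(1.0 - p))

def local_search(w, classes, k, psi_pred, R_pred, max_sweeps=10):
    classes = np.array(classes)

    def score(c):
        m, n = block_counts(w, c, k)
        psi = ekf_update(psi_pred, R_pred, m, n)
        return log_posterior(psi, psi_pred, R_pred, m, n)

    best = score(classes)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(classes)):
            for a in range(k):
                if a == classes[i]:
                    continue
                cand = classes.copy()
                cand[i] = a
                # Skip moves that would leave a block with no possible edges.
                if np.bincount(cand, minlength=k).min() < 2:
                    continue
                s = score(cand)
                if s > best:
                    best, classes, improved = s, cand, True
        if not improved:
            break
    return classes, best

# Toy network: two cliques of four nodes, with node 3 initially mislabeled.
W = np.zeros((8, 8))
W[:4, :4] = 1
W[4:, 4:] = 1
np.fill_diagonal(W, 0)
init = np.array([0, 0, 0, 1, 1, 1, 1, 1])
psi_pred, R_pred = np.zeros(4), 4.0 * np.eye(4)
labels, best = local_search(W, init, 2, psi_pred, R_pred)
```

Because a move is accepted only when it increases the objective, the returned score is never lower than the score of the initial assignment, which matches the hill-climbing behavior described above.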

5 Application to Enron email network 

We demonstrate the proposed procedure on a dynamic social network constructed
from the Enron corpus [9, 10], which consists of about 0.5 million email
messages between 184 Enron employees from 1998 to 2002. We place directed
edges between employees $i$ and $j$ at time $t$ if $i$ sends at least one
email to $j$ during week $t$. Each time step corresponds to a 1-week interval.
We make no distinction between emails sent "to", "cc", or "bcc". In addition
to the email data, the roles of most of the employees within the company (e.g.
CEO, president, manager, etc.) are available, which we use as classes for a
priori blockmodeling. Employees with unknown roles are placed in an "others"
class.

5.1 State tracking 

We begin by examining the temporal variation of the states, which we refer to
as state tracking. Recall that the states $\Psi^t$ correspond to the logit of
the edge probabilities $\Theta^t$. We first apply the a priori EKF to obtain
the state estimates $\hat\psi^{t|t}$ and their variances (the diagonal of
$R^{t|t}$). Applying the logistic function, we can then obtain the estimated
edge probabilities $\hat\Theta^{t|t}$ with confidence intervals.
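
One way to turn a state estimate and its variance into an edge probability with a confidence interval is sketched below. The construction is an assumption, since the text does not spell it out: the interval is formed on the logit scale using a Gaussian approximation and then mapped through the logistic function, which is monotone and so preserves the ordering of the endpoints. The function name `edge_prob_interval` and the example values are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def edge_prob_interval(psi_hat, var, z=1.96):
    """Return (lower, estimate, upper) for one edge probability.

    psi_hat : posterior state estimate on the logit scale
    var     : its posterior variance (a diagonal entry of the covariance)
    z       : Gaussian quantile; 1.96 gives an approximate 95% interval
    """
    half_width = z * math.sqrt(var)
    return (logistic(psi_hat - half_width),
            logistic(psi_hat),
            logistic(psi_hat + half_width))

lower, estimate, upper = edge_prob_interval(psi_hat=0.0, var=0.25)
```

Intervals built this way always stay inside (0, 1) and are generally asymmetric about the estimate unless the estimate is 0.5.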

³ At the initial time step, $\hat\psi^{1|0} = \mu^0$ and $R^{1|0} = \Gamma^0 + \Gamma^1$.

(b) Week 89: CEO Skilling resigns

Fig. 2. Estimated edge probability matrices for two selected weeks. Entry
$(i, j)$ denotes the estimated probability of an edge from class $i$ to class
$j$. Classes are as follows: (1) directors, (2) CEOs, (3) presidents, (4)
vice-presidents, (5) managers, (6) traders, and (7) others. Notice the
increase in the probability of edges from CEOs during the week of Skilling's
resignation.

Fig. 3. A priori EKF estimated edge probabilities $\hat\theta^{t|t}_{ab}$
(solid lines) with 95% confidence intervals (shaded region) for selected
$a, b$ by week. An increase in edge probabilities between Enron presidents (a)
occurs prior to a similar increase between those in other roles (b),
suggesting insider knowledge.

Examining the temporal variation of $\hat\Theta^{t|t}$ reveals some
interesting trends. For example, a large increase in the probabilities of
edges from CEOs is found at week 89. This is the week in which CEO Jeffrey
Skilling resigned, and examining the content of the emails confirms that the
resignation is the cause of the increased probabilities. Fig. 2 shows a
comparison of the matrix $\hat\Theta^{t|t}$ during a normal week and during
the week Skilling resigned.

Another interesting trend is highlighted in Fig. 3, where the temporal
variation of two selected edge probabilities over the entire data trace with
95% confidence intervals is shown. Edge probabilities between Enron presidents
show a steady increase as Enron's financial situation worsens, hinting at more
frequent and widespread insider discussions, while emails between others (not
of one of the six known roles) begin to increase only after Enron falls under
federal investigation.

A key observation from this analysis is the importance of modeling the edge
probabilities as time-varying states, as opposed to time-invariant parameters
as in [4, 15]. Indeed the temporal variation of the edge probabilities is what
reveals the internal dynamics of this time-evolving social network.
Furthermore, the temporal model provides estimates with less uncertainty than
the static SBM, with 95% confidence intervals that are 24% narrower on
average.

5.2 Dynamic link prediction 

Next we turn to the task of dynamic link prediction, which differs from static
link prediction [7] because the link predictor must simultaneously predict the
new edges that will be formed at time $t+1$, as well as the current edges (as
of time $t$) that will disappear at time $t+1$, from the observations
$W^{(t)}$. The latter task is not addressed by most static link prediction
methods in the literature.

Since the SBM assumes stochastic equivalence between nodes in the same class,
the EKF alone is only a good predictor of the block densities $Y^t$, not the
edges themselves. However, the EKF can be combined with a predictor that
operates on individual edges to form a link predictor. A simple
individual-level predictor is the exponentially-weighted moving average (EWMA)
given by $\hat W^{t+1} = \lambda \hat W^t + (1 - \lambda) W^t$. Using a convex
combination of the EKF and EWMA predictors, we obtain a better link predictor
that incorporates both block-level characteristics (through the EKF) and
individual-level characteristics (through the EWMA). This can be seen from the
receiver operating characteristic (ROC) curves in Fig. 4. The a posteriori EKF
slightly outperforms the a priori EKF because the a posteriori EKF finds a
better fit to the dynamic SBM via a better assignment of nodes to classes than
the a priori (assumed) assignment.
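
The combined link predictor can be sketched as follows. This is an illustration under stated assumptions: the block-level score for an ordered pair of nodes is taken to be the estimated edge probability of the pair's classes, and the mixing weight `alpha`, the forgetting factor `lam`, the function names, and the toy data are hypothetical choices not specified in the text.

```python
import numpy as np

def ewma_update(w_hat, w, lam):
    """EWMA predictor: new prediction from the previous prediction and snapshot."""
    return lam * w_hat + (1.0 - lam) * w

def combined_score(theta_hat, classes, w_ewma, alpha):
    """Convex combination of block-level and individual-level link scores.

    theta_hat : (k, k) estimated edge probabilities from the EKF
    classes   : length-n class labels of the nodes
    w_ewma    : (n, n) EWMA prediction for the next snapshot
    alpha     : weight in [0, 1] on the block-level score (assumed tuning knob)
    """
    classes = np.asarray(classes)
    # Block-level score for pair (i, j) is theta_hat[c_i, c_j].
    block_score = theta_hat[classes[:, None], classes[None, :]]
    score = alpha * block_score + (1.0 - alpha) * w_ewma
    np.fill_diagonal(score, 0.0)  # self-edges are never predicted
    return score

# Hypothetical example with 3 nodes and 2 classes.
theta_hat = np.array([[0.6, 0.1],
                      [0.2, 0.4]])
classes = np.array([0, 0, 1])
W_t = np.array([[0, 1, 0],
                [0, 0, 0],
                [1, 0, 0]])
w_ewma = ewma_update(np.zeros((3, 3)), W_t, lam=0.5)
score = combined_score(theta_hat, classes, w_ewma, alpha=0.5)
```

Thresholding `score` at different levels traces out an ROC curve. Pairs with a recent edge and a high block-level probability, such as (0, 1) in this example, receive the highest scores.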

Fig. 4. Comparison of ROC curves for link prediction on Enron data. True
positive rate denotes the fraction of actual edges that are correctly
predicted, and false positive rate denotes the fraction of non-edges that are
predicted to be edges. The convex combination of either EKF with the EWMA
outperforms the EWMA alone by accounting for block-level characteristics.

6 Conclusion

This paper proposes a statistical model for dynamic networks that utilizes a
set of unobserved time-varying states to characterize the dynamics of the
network. The proposed model extends the well-known stochastic blockmodel for
static networks to the dynamic setting and can be used for either a priori or
a posteriori blockmodeling. The main contribution of the paper is a
near-optimal on-line inference procedure for the proposed model using a
modification of the extended Kalman filter, augmented with a local search. We
applied the proposed inference procedure to the Enron email network and
discovered some interesting trends when we examined the estimated states. One
such trend was a steady increase in emails between Enron presidents as Enron's
financial situation worsened, while emails between other employees remained at
their baseline levels until Enron fell under federal investigation. In
addition, the proposed procedure showed promising results for predicting
future email activity. We believe the proposed model and inference procedure
can be applied to reveal the internal dynamics of many other dynamic networks.